19 research outputs found

Giving a Sense: A Pilot Study in Concept Annotation from Multiple Resources

Author: Bojar Ondřej
Sudarikov Roman
Publication date: 01/01/2015

We present a pilot study of web-based annotation of words with senses. The annotated senses come from several knowledge bases and sense inventories. The study is the first step in a planned larger annotation of grounding and should allow us to select a subset of the sense sources that covers any given text reasonably well and shows an acceptable level of inter-annotator agreement.

Biblio at Institute of Formal and Applied Linguistics

TectoMT – a deep-linguistic core of the combined Chimera MT system

Author: Bojar Ondřej
Hajič Jan
Popel Martin
Rosa Rudolf
Sudarikov Roman
Publication date: 01/01/2016

Chimera is a machine translation system that combines the TectoMT deep-linguistic core with the phrase-based MT system Moses. For the English–Czech pair it also uses the Depfix post-correction system. All the components run on Unix/Linux platforms and are open source (available from the Perl repository CPAN and the LINDAT/CLARIN repository). The main website is https://ufal.mff.cuni.cz/tectomt. The development is currently supported by the QTLeap 7th FP project (http://qtleap.eu).

Biblio at Institute of Formal and Applied Linguistics

Dictionary-based Domain Adaptation of MT Systems without Retraining

Author: Bojar Ondřej
Novák Michal
Popel Martin
Rosa Rudolf
Sudarikov Roman
Publication date: 01/01/2016

We describe our submission to the IT-domain translation task of WMT 2016. We perform domain adaptation with dictionary data on already trained MT systems, with no further retraining. We apply our approach to two conceptually different systems developed within the QTLeap project, TectoMT and Moses, as well as to Chimera, their combination. In all settings, our method improves the translation quality. Moreover, the basic variant of our approach is applicable to any MT system, including a black-box one.

Biblio at Institute of Formal and Applied Linguistics

Using MT-ComparEval

Author: Bojar Ondřej
Burchardt Aljoscha
Klejch Ondřej
Popel Martin
Sudarikov Roman
Publication date: 01/01/2016

The paper showcases the MT-ComparEval tool for qualitative evaluation of machine translation (MT). MT-ComparEval is an open-source tool designed to help MT developers by providing a graphical user interface for comparing and evaluating different MT engines, experiments and settings.

Biblio at Institute of Formal and Applied Linguistics

On the possibility of reducing man-made burden on benthic biotic communities when mining solid minerals using technical means of various designs

Author: Dmitrii A. Yungmeister
Roman I. Korolev
Sergei M. Sudarikov
Vladimir A. Petrov
Publication venue: 'Saint-Petersburg Mining University'
Publication date: 01/04/2022

The paper analyses the species composition and diversity of biotic communities living within the ferromanganese nodule fields (the Clarion-Clipperton field), cobalt-manganese crusts (the Magellan Seamounts) and deep-sea polymetallic sulphides (the Ashadze-1, Ashadze-2, Logatchev and Krasnov fields) in the Russian exploration areas of the Pacific and Atlantic Oceans. It considers the prospects of mining solid minerals of the world’s oceans with the least possible damage to marine ecosystems, covering the formation of sediment plumes and the roiling of significant volumes of water during mineral collection, as well as conservation of the hydrothermal fauna and microbiota, including in the impact zone of high-temperature hydrothermal vents. Different concepts and layout options for deep-water mining complexes (the Indian and Japanese concepts as well as those of Nautilus Minerals and Saint Petersburg Mining University) are examined with respect to their operational efficiency. The main types of mechanisms that make up the complexes are identified and assessed against defined priorities: the ecological aspect, i.e. the impact on the seabed environment; manufacturing and operating costs; and specific energy consumption, i.e. the technical and economic indicators. The morphological analysis presented justifies the layout of a deep-sea mineral collecting unit, i.e. a device with suction chambers and a grip-arm walking gear, selected with the environmental aspect as the key priority. Pilot experimental studies of the physical and mechanical properties of cobalt-manganese crust samples were performed by applying bilateral axial force through spherical indenters and producing a rock strength passport for assessing subsequent experimental results. Experimental destructive tests of the cobalt-manganese crust by impact and cutting were carried out to determine the impact load and axial cutting force required to implement a collecting system that uses a clamshell-type effector with a built-in impactor.

Directory of Open Access Journals

Giving a Sense: A Pilot Study in Concept Annotation from Multiple Resources

Author: Ondřej Bojar
Roman Sudarikov
Publication date: 03/04/2020

Abstract: We present a pilot study in web-based annotation of words with senses coming from several knowledge bases and sense inventories. The study is the first step in a planned larger annotation of "grounding" and should allow us to select a subset of these "dictionaries" that seems to cover any given text reasonably well and shows an acceptable level of inter-annotator agreement.

CiteSeerX

MT-ComparEval

Author: Klejch Ondřej
Popel Martin
Sudarikov Roman
Publication date: 01/01/2015

MT-ComparEval is a tool for Machine Translation developers that allows them to compare and evaluate different MT systems (and their versions). MT-ComparEval includes several automatic MT evaluation metrics.

Biblio at Institute of Formal and Applied Linguistics

TeamUFAL: WSD+EL as Document Retrieval

Author: Bojar Ondřej
Fanta Petr
Sudarikov Roman
Publication date: 01/01/2015

This paper describes our system for SemEval-2015 Task 13: Multilingual All-Words Sense Disambiguation and Entity Linking. We participated in the sub-task that targets monolingual all-words disambiguation and entity linking. Aside from the system description, we pay closer attention to the evaluation of system output.

Biblio at Institute of Formal and Applied Linguistics

CUNI-LMU Submissions in WMT2016: Chimera Constrained and Beaten

Author: Bojar Ondřej
Fraser Alexander
Sudarikov Roman
Tamchyna Aleš
Publication date: 01/01/2016

This paper describes the phrase-based systems jointly submitted by CUNI and LMU to the English-Czech and English-Romanian News translation tasks of WMT16. In contrast to previous years, we strictly limited our training data to the constrained datasets, to allow for a reliable comparison with other research systems. We experiment with using several additional models in our system, including a feature-rich discriminative model of phrasal translation.

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

Verb Sense Disambiguation in Machine Translation

Author: Bojar Ondřej
Dušek Ondřej
Holub Martin
Kríž Vincent
Sudarikov Roman
Publication date: 01/01/2016

We describe experiments in Machine Translation using word sense disambiguation (WSD) information. This work focuses on WSD for verbs, based on two different approaches: verbal patterns based on corpus pattern analysis, and verbal word senses from valency frames. We evaluate several options for using verb senses in the source-language sentences as an additional factor for the Moses statistical machine translation system. Our results show a statistically significant improvement in translation quality in terms of the BLEU metric for the valency frames approach, but in manual evaluation, both WSD methods bring improvements.

Biblio at Institute of Formal and Applied Linguistics